470: feat: Make `Schema::denormalize` public by martin-augment · Pull Request #50 · martin-augment/avro-rs

martin-augment · 2026-02-19T10:08:06Z

470: To review by AI

coderabbitai · 2026-02-19T10:08:21Z

Walkthrough

The Schema type's denormalization functionality was refactored to improve the public API. A new public method denormalize was introduced that accepts a schemata array and manages internal state initialization. The previous logic was extracted into a new private helper method denormalize_inner that takes an additional HashSet parameter for tracking defined names. All internal recursive calls throughout the codebase were updated to invoke the new private helper instead of the original method. The independent_canonical_form method was updated to use the new public denormalize signature. Documentation comments were added to both the public method and private helper clarifying their purposes and behavior.

✨ Finishing Touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch pr-470-2026-02-19-10-08-03

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

gemini-code-assist · 2026-02-19T10:08:22Z

Summary of Changes

Hello @martin-augment, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the avro crate by exposing the Schema::denormalize method as a public API. This change provides users with direct control over removing external references from Avro schemas, facilitating the creation of self-contained schema definitions. The underlying denormalization logic was encapsulated in a private helper, ensuring a clean public interface while maintaining the integrity of the recursive process.

Highlights

Public API Exposure: The Schema::denormalize method has been made public, allowing external consumers to explicitly remove external references from Avro schemas.
Internal Refactoring: The core denormalization logic was refactored into a new private helper method, denormalize_inner, to manage internal state like defined_names and provide a cleaner public interface.
Documentation Update: Comprehensive documentation was added for the newly public denormalize method, detailing its purpose, parameters, and error handling.
Internal Call Adjustments: All internal recursive calls within the schema processing logic, including independent_canonical_form and various Schema enum variants, were updated to utilize the new denormalize_inner helper.

Changelog

avro/src/schema/mod.rs
- Made Schema::denormalize public.
- Renamed the internal denormalization logic to denormalize_inner.
- Updated independent_canonical_form to use the new public denormalize method.
- Added documentation for the public denormalize method.
- Adjusted internal recursive calls to use denormalize_inner.

Activity

The pull request was submitted for review by AI.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

augmentcode · 2026-02-19T10:09:52Z

🤖 Augment PR Summary

Summary: This PR makes schema denormalization a public API so callers can inline externally-referenced named schemas and work with a fully self-contained schema.

Changes:

Expose Schema::denormalize publicly and keep the recursive implementation in a private denormalize_inner.
Update independent_canonical_form to call the new public method.
Adjust internal recursive calls to use denormalize_inner to preserve the original behavior.

Technical Notes: Denormalization mutates the schema in place, tracks already-defined names to avoid infinite recursion, and returns a resolution error if any referenced schema is missing (potentially leaving a partially denormalized value).

_{🤖 Was this summary useful? React with 👍 or 👎}

augmentcode

Review completed. No suggestions at this time.

Comment augment review to trigger a new review at any time.

claude · 2026-02-19T10:10:24Z

Code Review

This PR makes Schema::denormalize public by introducing a clean split into a public wrapper (denormalize) and a private recursive implementation (denormalize_inner). The approach is sound and the refactoring of independent_canonical_form to delegate to the new public method is correct.

Issues

Typo in doc comment (avro/src/schema/mod.rs, line 401):

If your require a self contained schema...

Should be: "If you require..."

Broken/incorrect rustdoc link in # Errors section (denormalize doc, line 698):

/// Will return a [`Details::SchemaResolutionError`] if it fails to find

Details is not re-exported from the crate root — it only appears via Error::details() and Error::into_details(). The intra-doc link [Details::SchemaResolutionError] will fail to resolve in user-facing documentation. Consider either:

Using the full path: [`crate::error::Details::SchemaResolutionError`]
Or simplifying to: Will return [Err] if it fails to find a referenced schema (schema resolution error).

Partial mutation on error is a public API footgun:
The doc correctly warns that an error "will put the schema in a partly denormalized state", but for a &mut self public API, leaving self in an unspecified/broken intermediate state on error is surprising behavior. Consider documenting more explicitly what "partly denormalized" means and whether the schema is still valid/usable afterwards, or if callers should discard it. A note like "On error, the schema is left in an indeterminate state and should be discarded" would be clearer.

Minor Notes

Missing doc example: Other public methods in this file include a # Examples section. Adding one to denormalize would improve discoverability and serve as a usage guide for callers who want to use it directly rather than via independent_canonical_form.

Minor performance regression: The original code in independent_canonical_form passed HashSet::with_capacity(schemata.len()). The new denormalize uses HashSet::new() (no pre-allocation). This is unlikely to matter in practice (the set tracks named types, not all of schemata), but it's worth noting the pre-allocation hint is now lost.

Summary

The core approach is correct and the API design (hiding defined_names from callers) is the right call. The main blocking issues are the typo and the broken intra-doc link. The partial-state-on-error behavior deserves clearer documentation before stabilizing as a public API.

gemini-code-assist

Code Review

This pull request makes the Schema::denormalize function public, refactoring its implementation into a private denormalize_inner function for a cleaner API. However, a potential stack overflow vulnerability was identified in the recursive logic of denormalize_inner when handling cycles of schema references, which could lead to Denial of Service. It is recommended to implement cycle detection or a recursion depth limit. Additionally, a performance improvement is suggested for schema resolution within denormalize by using a HashMap for faster lookups.

gemini-code-assist · 2026-02-19T10:11:27Z

    pub fn independent_canonical_form(&self, schemata: &[Schema]) -> Result<String, Error> {
        let mut this = self.clone();
-        this.denormalize(schemata, &mut HashSet::with_capacity(schemata.len()))?;
+        this.denormalize(schemata)?;


This call to denormalize is vulnerable to a stack overflow if schemata contains a cycle of Schema::Ref objects. While the Avro parser prevents such cycles in schemas it parses from JSON, this function is public and can be called with manually constructed Schema objects that form a cycle of references.

value:good-but-wont-fix; category:bug; feedback: The Gemini AI reviewer is correct! But since this is a manually created Schema instance (in contrast to a parsed one) this is responsibility of the application developer to create a valid and usable Schema.

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (1)

avro/src/schema/mod.rs (1)
700-702: No direct unit test for the new pub fn denormalize

avro_rs_420_independent_canonical_form exercises denormalize only through independent_canonical_form. The new public method needs its own tests covering at minimum:

Successful resolution of external Refs (asserting no Schema::Ref remain in the output).

Error path: an unresolvable Ref returns Err(Details::SchemaResolutionError(...)).

Mutual recursion (A ↔ B) terminates and produces the correct partially-inlined form.

Do you want me to generate a test suite for Schema::denormalize covering these scenarios?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@avro/src/schema/mod.rs` around lines 700 - 702, Add direct unit tests for the
new public Schema::denormalize method (not only via independent_canonical_form).
Create tests that (1) construct schemas with external Ref entries and assert
denormalize returns Ok and that no Schema::Ref remain, (2) construct a schema
with an unresolvable Ref and assert denormalize returns
Err(Details::SchemaResolutionError(...)), and (3) create mutually recursive
schemas (A ↔ B) and assert denormalize terminates and produces the expected
partially-inlined structure; use denormalize (and denormalize_inner where
appropriate) to exercise resolution paths and mirror how
independent_canonical_form currently exercises it.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@avro/src/schema/mod.rs`:
- Line 401: Fix the typo in the doc comment for Schema::denormalize: change "If
your require a self contained schema..." to "If you require a self contained
schema..." so the sentence reads correctly and references Schema::denormalize;
update the comment near the Schema::denormalize documentation in mod.rs
accordingly.
- Around line 693-702: The public method denormalize currently calls
denormalize_inner in-place which can leave self partially modified on Err;
change denormalize to operate on a clone and only replace self on success: clone
self (use a Vec::with_capacity(schemata.len()) where you build temporary
schemata-related structures if applicable), call denormalize_inner on the cloned
instance, and if it returns Ok, swap or assign the cloned, denormalized value
back into self; if Err, return the error and leave the original self untouched.
Ensure you reference and use the existing denormalize, denormalize_inner, and
Schema abstractions when implementing the clone-and-swap approach.

---

Nitpick comments:
In `@avro/src/schema/mod.rs`:
- Around line 700-702: Add direct unit tests for the new public
Schema::denormalize method (not only via independent_canonical_form). Create
tests that (1) construct schemas with external Ref entries and assert
denormalize returns Ok and that no Schema::Ref remain, (2) construct a schema
with an unresolvable Ref and assert denormalize returns
Err(Details::SchemaResolutionError(...)), and (3) create mutually recursive
schemas (A ↔ B) and assert denormalize terminates and produces the expected
partially-inlined structure; use denormalize (and denormalize_inner where
appropriate) to exercise resolution paths and mirror how
independent_canonical_form currently exercises it.

coderabbitai · 2026-02-19T10:12:44Z

    /// Returns the [Parsing Canonical Form] of `self` that is self contained (not dependent on
    /// any definitions in `schemata`)
    ///
+    /// If your require a self contained schema including `default` and `doc` attributes, see [`denormalize`][Schema::denormalize].


⚠️ Potential issue | 🟡 Minor

Typo in doc comment: "your require" → "you require"

📝 Fix

- /// If your require a self contained schema including `default` and `doc` attributes, see [`denormalize`][Schema::denormalize]. + /// If you require a self contained schema including `default` and `doc` attributes, see [`denormalize`][Schema::denormalize].

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

/// If your require a self contained schema including `default` and `doc` attributes, see [`denormalize`][Schema::denormalize].

/// If you require a self contained schema including `default` and `doc` attributes, see [`denormalize`][Schema::denormalize].

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@avro/src/schema/mod.rs` at line 401, Fix the typo in the doc comment for Schema::denormalize: change "If your require a self contained schema..." to "If you require a self contained schema..." so the sentence reads correctly and references Schema::denormalize; update the comment near the Schema::denormalize documentation in mod.rs accordingly.

value:good-to-have; category:typo; feedback: The CodeRabbit AI reviewer is correct! There is a minor typo in the documentation that should be fixed before merging the pull request.

coderabbitai · 2026-02-19T10:12:45Z

+    /// Remove all external references from the schema.
+    ///
+    /// `schemata` must contain all externally referenced schemas.
+    ///
+    /// # Errors
+    /// Will return a [`Details::SchemaResolutionError`] if it fails to find
+    /// a referenced schema. This will put the schema in a partly denormalized state.
+    pub fn denormalize(&mut self, schemata: &[Schema]) -> AvroResult<()> {
+        self.denormalize_inner(schemata, &mut HashSet::new())
+    }


⚠️ Potential issue | 🟠 Major

New public API leaves self in a partially-denormalized state on error

The doc comment mentions this, but exposing it publicly makes it a real footgun: a caller who handles the Err and then continues using self will operate on a partially-inlined schema tree (some Refs replaced, some not). Because this is now part of the public API, atomicity is worth the extra clone:

🛡️ Proposed atomic implementation

pub fn denormalize(&mut self, schemata: &[Schema]) -> AvroResult<()> { - self.denormalize_inner(schemata, &mut HashSet::new()) + let mut candidate = self.clone(); + candidate.denormalize_inner(schemata, &mut HashSet::with_capacity(schemata.len()))?; + *self = candidate; + Ok(()) }

This also restores the with_capacity(schemata.len()) hint that was dropped when the method was split (minor allocation efficiency).

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@avro/src/schema/mod.rs` around lines 693 - 702, The public method denormalize currently calls denormalize_inner in-place which can leave self partially modified on Err; change denormalize to operate on a clone and only replace self on success: clone self (use a Vec::with_capacity(schemata.len()) where you build temporary schemata-related structures if applicable), call denormalize_inner on the cloned instance, and if it returns Ok, swap or assign the cloned, denormalized value back into self; if Err, return the error and leave the original self untouched. Ensure you reference and use the existing denormalize, denormalize_inner, and Schema abstractions when implementing the clone-and-swap approach.

value:valid-but-wont-fix; category:bug; feedback: The CodeRabbit AI reviewer is correct that this will solve the issue with the partially denormalized schema. But a partially denormalised schema is still a valid schema. If the user does not want a partially denormalized schema then they can clone it before attempting to denormalize.

martin-augment · 2026-02-19T11:18:29Z

Typo in doc comment (avro/src/schema/mod.rs, line 401):
If your require a self contained schema...
Should be: "If you require..."

value:good-to-have; category:typo; feedback: The Claude AI reviewer is correct! There is a minor typo in the documentation that should be fixed before merging the pull request.

martin-augment · 2026-02-19T11:22:59Z

Broken/incorrect rustdoc link in # Errors section (denormalize doc, line 698):
/// Will return a [`Details::SchemaResolutionError`] if it fails to find
Details is not re-exported from the crate root — it only appears via Error::details() and Error::into_details(). The intra-doc link [Details::SchemaResolutionError] will fail to resolve in user-facing documentation. Consider either:

Using the full path: [`crate::error::Details::SchemaResolutionError`]

Or simplifying to: Will return [Err] if it fails to find a referenced schema (schema resolution error).

value:annoying; category:bug; feedback: The Claude AI reviewer is not correct! The documentation generates just fine! The Details struct is imported in the beginning of the file and this allows to use it in docstrings even if it is not a member of the crate's top-level exports

feat: Make Schema::denormalize public

476e9a8

augmentcode bot reviewed Feb 19, 2026

View reviewed changes

gemini-code-assist bot reviewed Feb 19, 2026

View reviewed changes

coderabbitai bot reviewed Feb 19, 2026

View reviewed changes

	/// If your require a self contained schema including `default` and `doc` attributes, see [`denormalize`][Schema::denormalize].
	/// If you require a self contained schema including `default` and `doc` attributes, see [`denormalize`][Schema::denormalize].

Conversation

martin-augment commented Feb 19, 2026

Uh oh!

coderabbitai bot commented Feb 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Uh oh!

gemini-code-assist bot commented Feb 19, 2026

Summary of Changes

Highlights

Footnotes

Uh oh!

augmentcode bot commented Feb 19, 2026

Uh oh!

augmentcode bot left a comment

Choose a reason for hiding this comment

Uh oh!

claude bot commented Feb 19, 2026

Code Review

Issues

Minor Notes

Summary

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist bot Feb 19, 2026

Choose a reason for hiding this comment

Uh oh!

martin-augment Feb 19, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 19, 2026

Choose a reason for hiding this comment

Uh oh!

martin-augment Feb 19, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 19, 2026

Choose a reason for hiding this comment

Uh oh!

martin-augment Feb 19, 2026

Choose a reason for hiding this comment

Uh oh!

martin-augment commented Feb 19, 2026

Uh oh!

martin-augment commented Feb 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

coderabbitai bot commented Feb 19, 2026 •

edited

Loading